Manufacturer
Interpretable Image Classification with Adaptive Prototype-based Vision Transformers
This method classifies an image by comparing it to a set of learned prototypes, providing explanations of the form "this looks like that." In our model, a prototype consists of parts, which can deform over irregular geometries to create a better comparison between images. Unlike existing models that rely on Convolutional Neural Network (CNN) backbones and spatially rigid prototypes, our model integrates Vision Transformer (ViT) backbones into prototype-based models, while offering spatially deformed prototypes that not only accommodate geometric variations of objects but also provide coherent and clear prototypical feature representations with an adaptive number of prototypical parts. Our experiments show that our model can generally achieve higher performance than existing prototype-based models. Our comprehensive analyses ensure that the prototypes are consistent and the interpretations are faithful. Our code is available at https://github.com/Henrymachiyu/ProtoViT.
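The "this looks like that" comparison can be illustrated with a minimal sketch. This is not the ProtoViT implementation: it assumes a generic prototype classifier in which each image is a list of patch feature vectors, each prototype is a single rigid vector (no deformable parts), and a class score is the sum of each of its prototypes' best cosine similarity to any patch. The names `cosine` and `classify` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify(patches, prototypes):
    """Score classes by how well their prototypes match some image patch.

    patches:    list of patch feature vectors for one image (assumed given)
    prototypes: list of (class_label, prototype_vector) pairs (assumed learned)
    Returns (predicted_label, evidence), where evidence lists, per prototype,
    (class_label, best_patch_index, similarity).
    """
    scores = {}
    evidence = []
    for label, proto in prototypes:
        sims = [cosine(patch, proto) for patch in patches]
        best = max(range(len(sims)), key=sims.__getitem__)
        # Each prototype contributes its best match to its class score.
        scores[label] = scores.get(label, 0.0) + sims[best]
        evidence.append((label, best, sims[best]))
    predicted = max(scores, key=scores.get)
    return predicted, evidence

# Toy example with made-up 2-D features.
patches = [[1.0, 0.0], [0.0, 1.0]]
prototypes = [("bird", [1.0, 0.1]), ("car", [-1.0, -1.0])]
label, evidence = classify(patches, prototypes)
```

The `evidence` list is what makes the prediction inspectable: each entry names the patch that a prototype matched best, which is the "this patch looks like that prototype" explanation in its simplest form.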
Dense Connector for MLLMs
Do we fully leverage the potential of the visual encoder in Multimodal Large Language Models (MLLMs)? The recent outstanding performance of MLLMs in multimodal understanding has garnered broad attention from both academia and industry. In the current MLLM rat race, the focus seems to be predominantly on the linguistic side.
Score Distillation via Reparametrized DDIM
While 2D diffusion models generate realistic, high-detail images, 3D shape generation methods like Score Distillation Sampling (SDS) built on these 2D diffusion models produce cartoon-like, over-smoothed shapes. To help explain this discrepancy, we show that the image guidance used in Score Distillation can be understood as the velocity field of a 2D denoising generative process, up to the choice of a noise term. In particular, after a change of variables, SDS resembles a high-variance version of Denoising Diffusion Implicit Models (DDIM) with a differently-sampled noise term: SDS introduces noise i.i.d.
Faith and Fate: Limits of Transformers on Compositionality
Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This begs the question: Are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify transformer LLMs, we investigate the limits of these models across three representative compositional tasks--multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer. We formulate compositional tasks as computation graphs to systematically quantify the level of complexity, and break down reasoning steps into intermediate sub-procedures. Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching, without necessarily developing systematic problem-solving skills. To round off our empirical study, we provide theoretical arguments on abstract multi-step reasoning problems that highlight how autoregressive generations' performance can rapidly decay with increased task complexity.
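As a rough illustration of casting a compositional task as a computation graph, the sketch below builds a graph for multi-digit multiplication out of single-digit products, place-value shifts, and additions. The node types, the function name `multiplication_graph`, and the use of node count as a size measure are assumptions made for this example; the abstract does not specify the paper's actual graph construction or complexity metrics.

```python
def multiplication_graph(x, y):
    """Build a toy computation graph for x * y (positive integers).

    Returns (nodes, result). Each node is a tuple (op, inputs, value):
      "mul"   - inputs are two raw digits (graph leaves)
      "shift" - input is the index of a "mul" node, scaled by its place value
      "add"   - inputs are the indices of two earlier nodes
    """
    nodes = []

    def add_node(op, inputs, value):
        nodes.append((op, inputs, value))
        return len(nodes) - 1

    # Digits in least-significant-first order, so index equals place value.
    x_digits = [int(c) for c in reversed(str(x))]
    y_digits = [int(c) for c in reversed(str(y))]

    partials = []
    for j, b in enumerate(y_digits):
        for i, a in enumerate(x_digits):
            prod = add_node("mul", (a, b), a * b)
            shifted = add_node("shift", (prod,), a * b * 10 ** (i + j))
            partials.append(shifted)

    # Chain the shifted partial products together with additions.
    total = partials[0]
    for p in partials[1:]:
        total = add_node("add", (total, p), nodes[total][2] + nodes[p][2])

    return nodes, nodes[total][2]

nodes, result = multiplication_graph(12, 34)
```

In this toy construction, an n-digit by m-digit product yields 3nm - 1 nodes, so the graph grows quadratically with operand length. That growth gives one simple way to see how task complexity can be quantified and scaled.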
An NLP Benchmark Dataset for Assessing Corporate Climate Policy Engagement
As societal awareness of climate change grows, corporate climate policy engagements are attracting attention. We propose a dataset to estimate corporate climate policy engagement from various PDF-formatted documents. Our dataset comes from LobbyMap (a platform operated by the global think tank InfluenceMap) that provides engagement categories and stances on the documents. To convert the LobbyMap data into a structured dataset, we developed a pipeline using text extraction and OCR. Our contributions are: (i) Building an NLP dataset including 10K documents on corporate climate policy engagement.
The Cybertruck was supposed to be apocalypse-proof. Can it even survive a trip to the grocery store?
The Cybertruck answers a question no one in the auto industry even thought to ask: what if there was a truck that a Chechen warlord couldn't possibly pass up -- a bulletproof, bioweapons-resistant, road rage-inducing street tank that's illegal to drive in most of the world? Few had seen anything quite like the Cybertruck when it was unveiled in 2019. Wrapped in an "ultra-hard, 30X, cold-rolled stainless steel exoskeleton", the Cybertruck was touted as the ultimate doomsday chariot -- a virtually indestructible, obtuse-angled, electrically powered behemoth that can repel handgun fire and outrun a Porsche while towing a Porsche, with enough juice leftover to power your house in the event of a blackout. At the launch, Tesla's CEO, Elon Musk, said the truck could tackle any terrain on Earth and possibly also on Mars -- and all for the low, low base price of $40,000. "Sometimes you get these late-civilization vibes [that the] apocalypse could come along at any moment," Musk said.
Trump Wants to Bring Back Factory Jobs. I Worked on the Assembly Line. It Was Hell.
I once witnessed a friend going through a severe midlife crisis. Basically overnight, this formerly serious and well-adjusted middle-aged man dumped his wife for a much younger girlfriend, got a face tattoo, and built a full-sized halfpipe in his house. Soon, we were barraged with music recommendations (all stuff he'd listened to in high school and college) and life updates laden with "hip" "slang" ("Despite the age gap, my situationship with Triniteigh is lowkey lit"). It was a transparent--and, from a certain perspective, even sympathetic--response to a universal anxiety: He'd seen that the good times were over, and that only decline lay ahead. But, like all nostalgists, he didn't realize that you can't ever truly go back; you can only go backward. The United States, under President Donald Trump, seems to be undergoing a similar midlife crisis, as this reactionary administration attempts to brute-force the country back to a golden age that many people are realizing either didn't exist in the first place or has been permanently lost to the mists of time and modernization.
Can new patrol vehicles crack down on 'video game-styled' driving in California?
The California Highway Patrol is deploying new patrol vehicles in hopes of cracking down on what the agency called "video game-styled" driving. The vehicles, 100 Dodge Durangos, will be paired with a fleet of Dodge Chargers and Ford Explorers to "observe the most reckless and dangerous behaviors without immediate detection," according to a CHP news release. "The new vehicles give our officers an important advantage," CHP Commissioner Sean Duryee said in a statement. "They will allow us to identify and stop drivers who are putting others at risk, while still showing a professional and visible presence once enforcement action is needed." The vehicles will be placed in various regions across the state starting this week.